Columns F and G show the combined total number of individuals at risk and the total number of
individuals who died, which is obtained by combining the corresponding columns for the two
groups.
Column H, labeled % At Risk, shows Group 1’s percentage of the total number of at-risk
individuals per time slice.
Column I, labeled Expected Deaths, shows the number of deaths you’d expect to see in Group 1
based on apportioning the total number of deaths (in both groups) by Group 1’s percentage of total
individuals at risk. For the 0–1 day row, you multiply the nine total deaths by Group 1’s share of
the 89 individuals at risk.
Column J, labeled Excess Deaths, shows the excess number of actual deaths compared to the
expected number for Group 1.
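If it helps to see Columns F through J as arithmetic, here is a minimal Python sketch for a single time slice. The function name `slice_columns` and the counts passed to it are made up for illustration; they are not the values in the figure.

```python
def slice_columns(n1, d1, n2, d2):
    """Return Columns F through J for one time slice.

    n1, n2: individuals at risk in Group 1 and Group 2
    d1, d2: deaths in Group 1 and Group 2
    """
    n_total = n1 + n2                   # Column F: total at risk
    d_total = d1 + d2                   # Column G: total deaths
    frac_at_risk = n1 / n_total         # Column H: Group 1's share (as a fraction)
    expected = d_total * frac_at_risk   # Column I: expected deaths in Group 1
    excess = d1 - expected              # Column J: actual minus expected deaths
    return n_total, d_total, frac_at_risk, expected, excess


# Hypothetical counts, chosen only to keep the arithmetic easy to follow.
print(slice_columns(n1=40, d1=5, n2=60, d2=5))
# → (100, 10, 0.4, 4.0, 1.0)
```

In this made-up slice, Group 1 holds 40 percent of the individuals at risk, so you'd expect it to have 4 of the 10 deaths; it actually had 5, giving 1 excess death.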
Column K shows the variance (equal to the square of the standard deviation) of the excess deaths.
It’s obtained from this complicated formula that’s based on the properties of the binomial
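distribution (see Chapter 24):
$$V = D_T \times \frac{N_1}{N_T} \times \frac{N_2}{N_T} \times \frac{N_T - D_T}{N_T - 1}$$
For the first time slice (0–1 day), with 89 individuals at risk and nine deaths in total, this
works out to approximately 1.813.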
N refers to the number of individuals at risk, D refers to deaths, the subscripts 1 and 2 refer to
groups 1 and 2, and T refers to the total of both groups combined.
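Here is a minimal Python sketch of the Column K calculation, assuming the variance is the standard log-rank one: D_T × (N_1/N_T) × (N_2/N_T) × (N_T − D_T)/(N_T − 1). The function name `slice_variance`, the guard for tiny slices, and the example counts are illustrative assumptions, not values from the figure.

```python
def slice_variance(n1, n2, d_total):
    """Variance of Group 1's excess deaths for one time slice.

    n1, n2: individuals at risk in Group 1 and Group 2 (N_1 and N_2)
    d_total: deaths in both groups combined (D_T)
    """
    n_total = n1 + n2  # N_T
    if n_total <= 1:
        # Assumption: a slice with at most one individual at risk
        # contributes no variance (and avoids dividing by zero).
        return 0.0
    return (d_total
            * (n1 / n_total)
            * (n2 / n_total)
            * (n_total - d_total) / (n_total - 1))


# Hypothetical slice: 40 and 60 at risk, 10 deaths in total.
print(round(slice_variance(40, 60, 10), 3))
# → 2.182
```

Because N_1 and N_2 enter the formula symmetrically, swapping the two groups leaves the variance unchanged.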
Next, you add up the excess deaths in all the time slices to get the total number of excess deaths for
Group 1 compared to what you would have expected if the deaths had been distributed between the
two groups in the same ratio as the number of at-risk individuals.
Then you add up all the variances. You are allowed to do that because the variance of a sum of
independent numbers equals the sum of their individual variances.
Finally, you divide the total excess deaths by the square root of the total variance to get a test statistic
called Z:
$$Z = \frac{\sum \text{Excess Deaths}}{\sqrt{\sum \text{Variance}}}$$
The Z value is approximately normally distributed, so you can obtain a p value from a table of the
normal distribution or from an online calculator. For the data in Figure 23-3, Z works out to
2.19. This Z value corresponds to a p value of 0.028, so the null hypothesis is rejected, and you can
conclude that the survival curves of the two groups differ to a statistically significant degree.
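The last two steps can be sketched in Python as follows. This is an illustration rather than a definitive implementation: the function name `z_and_p` and the per-slice numbers are hypothetical, and the sketch assumes the p value is two-sided, computed from the normal distribution with the standard library's `math.erfc`.

```python
import math


def z_and_p(excess_by_slice, variance_by_slice):
    """Return (Z, two-sided p value) from per-slice excess deaths and variances."""
    total_excess = sum(excess_by_slice)      # sum of Column J
    total_variance = sum(variance_by_slice)  # sum of Column K
    z = total_excess / math.sqrt(total_variance)
    # Two-sided tail area of the standard normal distribution beyond |z|.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Hypothetical per-slice values (not the numbers in the figure).
z, p = z_and_p([1.0, 0.5, 1.5], [2.0, 1.0, 1.0])
print(round(z, 2), round(p, 3))
```

Feeding in totals that give Z = 2.19, for example `z_and_p([2.19], [1.0])`, returns a two-sided p value between 0.028 and 0.029.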
Note: It doesn’t matter which group you assign as Group 1 in these calculations. The final
results come out the same either way.
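You can check this with a short Python sketch that runs the whole calculation twice, once with the groups swapped. It assumes the standard log-rank variance formula, and the function name `log_rank_z` and the three rows of counts are made up for illustration. In this sketch the sign of Z flips when the groups are swapped, but its magnitude, and therefore the two-sided p value, stays the same.

```python
import math


def log_rank_z(rows):
    """Compute Z from rows of (n1, d1, n2, d2), one tuple per time slice."""
    total_excess = 0.0
    total_variance = 0.0
    for n1, d1, n2, d2 in rows:
        n_t = n1 + n2  # total at risk in this slice
        d_t = d1 + d2  # total deaths in this slice
        if n_t <= 1:
            continue   # assumption: skip slices too small to contribute
        total_excess += d1 - d_t * n1 / n_t
        total_variance += (d_t * (n1 / n_t) * (n2 / n_t)
                           * (n_t - d_t) / (n_t - 1))
    return total_excess / math.sqrt(total_variance)


# Hypothetical life-table rows: (at risk 1, deaths 1, at risk 2, deaths 2).
rows = [(40, 5, 60, 5), (35, 4, 55, 3), (30, 3, 50, 2)]
swapped = [(n2, d2, n1, d1) for n1, d1, n2, d2 in rows]

z_original = log_rank_z(rows)
z_swapped = log_rank_z(swapped)
print(z_original, z_swapped)  # same magnitude, opposite signs
```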
Assessing the assumptions
Like all statistical tests, the log-rank test assumes that you studied an unbiased sample from the
population about which you’re trying to draw conclusions. It also assumes that any censoring that
occurred was due to circumstances unrelated to the treatment being tested (for example, individuals
didn’t drop out of the study because the drug made them sick).
Also, the log-rank test looks for differences in overall survival time. In other words, it’s not good at